server: accept agent cache hints by bittoby · Pull Request #82 · GeniePod/genie-ai-runtime

bittoby · 2026-05-23T08:06:33Z

Summary

parse top-level conversation_id plus nvext.agent_hints.session_id
accept priority, OSL, speculative-prefill, and ephemeral cache TTL metadata
sanitize session ids before using them for persistent KV sessions
echo accepted non-streaming hint metadata under jetson.agent_hints
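
For illustration, the parsing and sanitizing steps listed above might look roughly like the following Python sketch. It is not the PR's implementation: the allowed character set, the length cap, the clamp ranges, the `speculative_prefill` field name, and the choice to prefer `nvext.agent_hints.session_id` over `conversation_id` are all assumptions made for the example.

```python
import re

# Hypothetical limits chosen for this example, not taken from the PR.
MAX_SESSION_ID_LEN = 64
PRIORITY_RANGE = (0, 100)
OSL_RANGE = (1, 32768)


def sanitize_session_id(raw):
    """Return a filesystem-safe id, or None if nothing usable remains."""
    if not isinstance(raw, str):
        return None
    # Drop everything outside a conservative allow-list, then cap the length.
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "", raw)[:MAX_SESSION_ID_LEN]
    # Strip leading dots so the id cannot look like "." or "..".
    cleaned = cleaned.lstrip(".")
    return cleaned or None


def clamp_int(value, lo, hi):
    """Clamp an integer into [lo, hi]; reject non-integers (including bool)."""
    if isinstance(value, bool) or not isinstance(value, int):
        return None
    return max(lo, min(hi, value))


def extract_agent_hints(body):
    """Collect accepted hints from a request body (a parsed JSON object)."""
    nvext = body.get("nvext")
    hints = nvext.get("agent_hints") if isinstance(nvext, dict) else None
    if not isinstance(hints, dict):
        hints = {}

    accepted = {}

    # Assumed precedence: nested session_id first, then top-level conversation_id.
    session = (sanitize_session_id(hints.get("session_id"))
               or sanitize_session_id(body.get("conversation_id")))
    if session is not None:
        accepted["session_id"] = session

    priority = clamp_int(hints.get("priority"), *PRIORITY_RANGE)
    if priority is not None:
        accepted["priority"] = priority

    osl = clamp_int(hints.get("osl"), *OSL_RANGE)
    if osl is not None:
        accepted["osl"] = osl

    # "speculative_prefill" is a hypothetical field name for this sketch.
    if isinstance(hints.get("speculative_prefill"), bool):
        accepted["speculative_prefill"] = hints["speculative_prefill"]

    return accepted
```

A non-streaming response could then attach the returned dictionary under `jetson.agent_hints`, so the client can see which hints were accepted and in what normalized form.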

Verification

git diff --check origin/main..HEAD
cmake -S . -B /tmp/genie-ai-runtime-check -DJLLM_BUILD_SERVER=ON could not complete on this Mac host because CUDA/nvcc is not installed: Failed to find nvcc.

ai-hpc · 2026-06-06T04:15:08Z

Reviewed and merged at 7c0b259

Validated on Jetson Orin Nano (built the #82–#84 stack on top of #85, deployed, ran live): nvext.agent_hints parses and echoes under jetson.agent_hints (session_id sanitized, cache_control.ttl "15m" → 900s, priority/osl carried through). Clean, well-guarded parsing (clamped ints, sanitized ids). Thanks @bittoby
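
The TTL normalization mentioned in the validation note (`"15m"` → 900s) can be sketched as follows. This is a minimal Python illustration, not the merged code: the accepted unit suffixes (`s`, `m`, `h`) and the one-day upper bound are assumptions.

```python
import re

# Assumed unit table and cap; the PR may accept a different set.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}
MAX_TTL_SECONDS = 24 * 3600


def parse_ttl_seconds(ttl):
    """Convert a string such as "15m" to seconds, or return None if invalid."""
    if not isinstance(ttl, str):
        return None
    match = re.fullmatch(r"([0-9]+)([smh])", ttl.strip())
    if match is None:
        return None
    seconds = int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
    if seconds == 0:
        return None  # a zero TTL is treated as "no hint" in this sketch
    return min(seconds, MAX_TTL_SECONDS)  # clamp overly long lifetimes


print(parse_ttl_seconds("15m"))  # 900
```

Returning `None` for malformed input lets the caller ignore the hint rather than fail the whole request, which matches the guarded parsing style described in the review.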

server: accept agent cache hints

55554d0

This was referenced May 23, 2026

engine: report kv cache reuse counters #83

Merged

docs(server): document agent cache hints #84

Merged

ai-hpc merged commit 7c0b259 into GeniePod:main Jun 6, 2026

ai-hpc mentioned this pull request Jun 6, 2026

Path C Phase 2: CUDA Graphs for the decode step (env-var-gated) #23

Open

4 tasks

ai-hpc merged 1 commit into GeniePod:main from bittoby:dynamo-152-runtime-agent-hints
